Abstract

We consider the problem of estimating the invariant distribution function of an ergodic
diffusion process when the drift coefficient is unknown. The empirical distribution
function is a natural estimator which is unbiased, uniformly consistent and efficient in
different metrics. Here we study the optimality properties of another kind of estimator,
recently proposed. We consider a class of unbiased estimators and show that they are also
efficient, in the sense that their asymptotic risk, defined as the integrated mean square
error, attains an asymptotic minimax lower bound.
1 Introduction

We consider the problem of estimating the distribution function $F(x)$, $x \in \mathbb{R}$,
from the observation of a diffusion process $\{X_t : 0 \le t \le T\}$. We suppose that the
process $X_t$, $t \ge 0$, possesses the ergodic property with invariant measure $\mu$ and
$F(x) = \mu\{(-\infty, x]\}$. A natural estimator of $F(x)$, $x \in \mathbb{R}$, is the
empirical distribution function $\hat F_T(x)$. It is well known that this estimator is
uniformly consistent by the Glivenko-Cantelli theorem and asymptotically normal
(Kutoyants, 1997). The asymptotic efficiency of the empirical distribution function has
been considered for different models and different metrics. For the model of independent
and identically distributed random variables, the empirical distribution function is
asymptotically efficient, in a global framework, in the sense that its integrated mean
square error attains the lower bound valid for all estimators of the distribution function.
This result was established earlier by Levit (1978) and Millar (1979) using the theory of
local asymptotic normality. Gill and Levit (1995) obtained the same result using a
different approach based on a multidimensional version of the van Trees inequality. The
approach introduced by Gill and Levit was successfully applied in Kutoyants and Negri
(2001) to prove that the empirical distribution function is asymptotically efficient in
the problem of invariant distribution estimation for ergodic diffusion processes. For the
same model, Negri (1998) proved the asymptotic efficiency of the empirical distribution
function when the metric used in the risk function is based on the sup norm.

Recently (Kutoyants, 2004) a class of unbiased estimators for the invariant distribution
function has been introduced. These estimators, which do not contain the empirical
distribution function as a particular case, are consistent and asymptotically normal. In
this work we prove that they are also asymptotically efficient, in the sense that their
integrated mean square error attains the lower bound valid for all estimators of the
invariant distribution function. As in the case of the estimation of the invariant density,
we have many efficient estimators, so the problem of finding a second order efficient
estimator arises naturally (see Dalalyan and Kutoyants, 2004, where the problem is
considered for invariant density estimation). This problem is not considered here, but it
will be a subject of future research.

The note is organized as follows. In the next section we present the statement of the
problem and the assumptions. In Section 3 we present the lower bound for the risk. In
Section 4 we prove that the class of unbiased estimators attains this bound and finally,
in Section 5, we give some examples of such estimators.

2 Preliminaries 

In this section we introduce the model and its first properties, while the statistical
problem will be presented in the next section. Let us consider a one-dimensional diffusion
process

$$dX_t = S(X_t)\,dt + \sigma(X_t)\,dW_t, \qquad X_0, \quad t \ge 0, \tag{1}$$

where $\{W_t : t \ge 0\}$ is a standard Wiener process and the initial value $X_0$ is
independent of $W_t$, $t \ge 0$. The drift coefficient $S$ is supposed to be unknown to the
observer and the diffusion coefficient $\sigma$ is a known positive function. Let us
introduce the condition:

$\mathcal{ES}$. The function $S$ is locally bounded, the function $\sigma^2$ is positive and
continuous and for some $A > 0$ the condition $xS(x) + \sigma(x)^2 \le A(1 + x^2)$,
$x \in \mathbb{R}$, holds.

Under condition $\mathcal{ES}$ equation (1) has a unique weak solution (see Durrett,
1996, p. 210). To guarantee ergodicity we introduce the following condition:
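For readers who wish to experiment with model (1), a path can be approximated on a time grid by the Euler-Maruyama scheme. The following is only a minimal sketch: the function name `simulate_diffusion`, the step size and the illustrative choice $S(x) = -x$, $\sigma(x) = 1$ (which satisfies the growth condition above) are assumptions made here and do not come from the paper.

```python
import numpy as np

def simulate_diffusion(drift, sigma, x0, T, dt, rng):
    """Euler-Maruyama approximation of dX_t = S(X_t) dt + sigma(X_t) dW_t on [0, T]."""
    n = int(round(T / dt))
    path = np.empty(n + 1)
    path[0] = x0
    # Increments of the Wiener process over one step are independent N(0, dt).
    dW = rng.normal(0.0, np.sqrt(dt), size=n)
    for i in range(n):
        x = path[i]
        path[i + 1] = x + drift(x) * dt + sigma(x) * dW[i]
    return path

# Illustrative (hypothetical) coefficients: S(x) = -x, sigma(x) = 1.
rng = np.random.default_rng(0)
path = simulate_diffusion(lambda x: -x, lambda x: 1.0, x0=0.0, T=50.0, dt=0.01, rng=rng)
print(path.shape)
```

The scheme only approximates the weak solution; the discretisation error depends on the step `dt`.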

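$\mathcal{RP}$. The functions $S$ and $\sigma$ are such that

$$V_S(x) = \int_0^x \exp\left\{-2\int_0^y \frac{S(v)}{\sigma(v)^2}\,dv\right\}dy \longrightarrow \pm\infty, \qquad \text{as } x \to \pm\infty,$$

and

$$G(S) = \int_{-\infty}^{+\infty} \frac{1}{\sigma(y)^2}\exp\left\{2\int_0^y \frac{S(v)}{\sigma(v)^2}\,dv\right\}dy < +\infty.$$
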

If condition $\mathcal{RP}$ is satisfied then the weak solution $\{X_t, t \ge 0\}$ of (1) has
the ergodic property (see for example Gikhman and Skorohod, 1972), that is, there exists an
invariant probability measure $\mu_S$ such that for every measurable function $g$ with
$\mathbf{E}_S|g(\xi)| < \infty$ we have, with probability one,

$$\lim_{T\to\infty}\frac{1}{T}\int_0^T g(X_t)\,dt = \int_{-\infty}^{+\infty} g(z) f_S(z)\,dz = \mathbf{E}_S(g(\xi)),$$

where $\xi$ has the invariant measure as distribution, $\mathbf{E}_S$ denotes the mathematical
expectation with respect to $\mu_S$, and $f_S$ is the invariant density given by

$$f_S(y) = \frac{1}{G(S)\,\sigma(y)^2}\exp\left\{2\int_0^y \frac{S(v)}{\sigma(v)^2}\,dv\right\}. \tag{2}$$
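As an illustration of the invariant density, which is proportional to $\sigma(y)^{-2}\exp\{2\int_0^y S(v)/\sigma(v)^2\,dv\}$ with normalising constant $G(S)$, the sketch below evaluates it on a grid by trapezoidal quadrature. The helper name `invariant_density`, the grid and the test case $S(y) = -y$, $\sigma = 1$ (for which the invariant law is Gaussian with mean 0 and variance 1/2) are assumptions chosen here for the example.

```python
import math
import numpy as np

def invariant_density(drift, sigma, grid):
    """Normalised invariant density on an increasing grid that contains 0.

    f_S(y) is proportional to sigma(y)^-2 * exp(2 * int_0^y S(v) / sigma(v)^2 dv).
    """
    ratio = drift(grid) / sigma(grid) ** 2
    # Cumulative trapezoidal integral of S / sigma^2 from the left end of the grid ...
    steps = 0.5 * (ratio[1:] + ratio[:-1]) * np.diff(grid)
    cumulative = np.concatenate(([0.0], np.cumsum(steps)))
    # ... shifted so that the integral vanishes at the grid point closest to y = 0.
    cumulative -= cumulative[np.argmin(np.abs(grid))]
    unnormalised = np.exp(2.0 * cumulative) / sigma(grid) ** 2
    # Trapezoidal approximation of the normalising constant G(S).
    mass = np.sum(0.5 * (unnormalised[1:] + unnormalised[:-1]) * np.diff(grid))
    return unnormalised / mass

grid = np.linspace(-6.0, 6.0, 2401)
f = invariant_density(lambda y: -y, lambda y: np.ones_like(y), grid)
exact = np.exp(-grid ** 2) / math.sqrt(math.pi)  # density of N(0, 1/2)
print(np.max(np.abs(f - exact)))
```

The grid must be wide enough for the tails of the density to be negligible; otherwise the normalising constant is underestimated.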

Suppose we observe different diffusion processes $\{X_t : 0 \le t \le T\}$ given by equation
(1) with drift coefficients given respectively by $S_1$, $S_2$ and $S_0 = 0$ and initial
values respectively $X_0^{(1)}$, $X_0^{(2)}$ and $X_0^{(0)}$. Let us introduce the following
condition.

$\mathcal{EM}$. The functions $S_1$, $S_2$ and $\sigma$ satisfy condition $\mathcal{ES}$ and the
densities (with respect to the Lebesgue measure) of the corresponding initial values
$X_0^{(1)}$, $X_0^{(2)}$ and $X_0^{(0)}$ have the same support (if the initial value is
nonrandom, then we suppose that it takes the same value for all processes).

If condition $\mathcal{EM}$ holds true, all the measures $\mathbf{P}_S^{(T)}$, for different
$S$, induced by the process $\{X_t : 0 \le t \le T\}$ in the space $\mathcal{C}_T$ of all
continuous functions on $[0, T]$, with uniform metric and Borel $\sigma$-algebra
$\mathcal{B}(\mathcal{C}_T)$, are equivalent. See Kutoyants (2004, p. 34) and the references
therein.



3 The asymptotic global bound 

Given the diffusion process (1), we suppose that conditions $\mathcal{ES}$ and $\mathcal{RP}$
are satisfied and that $X_0$ has density $f_S$, given by (2), so the process
$\{X_t, t \ge 0\}$ is ergodic and strictly stationary. We are interested in the estimation
of the invariant distribution function

$$F_S(x) = \int_{-\infty}^{x} f_S(v)\,dv \tag{3}$$

from the observation $X^T = \{X_t : 0 \le t \le T\}$, solution of (1), when $\sigma$ is known
and $S$ is unknown. Let us denote by $\mathbf{E}_S$ the mathematical expectation with respect
to the measure $\mathbf{P}_S^{(T)}$.

Let $\bar F_T(x)$ be any estimator of $F_S(x)$ for $x \in \mathbb{R}$. We define the
integrated mean square error as

$$\rho_T(\bar F_T, F_S) = T\,\mathbf{E}_S\int |\bar F_T(x) - F_S(x)|^2\,\nu(dx), \tag{4}$$

where $\nu$ is a finite measure on $\mathbb{R}$. We will also refer to it as the global risk.
A natural estimator of $F_S(x)$ for $x \in \mathbb{R}$ is the empirical distribution function,
defined as follows:



$$\hat F_T(x) = \frac{1}{T}\int_0^T \chi_{\{X_t \le x\}}\,dt, \qquad x \in \mathbb{R}. \tag{5}$$
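On a path observed at equally spaced times, the time average defining the empirical distribution function can be approximated by a Riemann sum, that is, by the fraction of time steps at which the path lies at or below $x$. The helper below is a sketch under that discretisation assumption; the name `empirical_distribution` and the toy path are hypothetical.

```python
import numpy as np

def empirical_distribution(path, x):
    """Riemann-sum version of (1/T) * int_0^T 1{X_t <= x} dt on an equally spaced path."""
    # The last point is dropped so that each of the n time steps is counted once.
    return float(np.mean(path[:-1] <= x))

# Toy path with four time steps; two of the four left endpoints are <= 0.
toy_path = np.array([-1.0, 0.5, 0.2, -0.3, 1.0])
value_at_zero = empirical_distribution(toy_path, 0.0)
print(value_at_zero)
```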



This estimator is uniformly consistent, asymptotically normal and asymptotically efficient,
in the sense that the empirical distribution function achieves a local asymptotic minimax
lower bound for the integrated mean square error of an arbitrary estimator. For a fixed
function $\sigma$ let us introduce the classes
$\mathcal{S}_\sigma = \{S : \text{conditions } \mathcal{ES}, \mathcal{EM}, \mathcal{RP} \text{ are fulfilled}\}$
and $\mathcal{S}_* \subset \mathcal{S}_\sigma$ such that for every $S_*$ in $\mathcal{S}_*$
there exist a $\delta > 0$ and a vicinity
$V_\delta = \{S : \sup_{x \in \mathbb{R}} |S_*(x) - S(x)| < \delta,\ S \in \mathcal{S}_*\}$
such that $\sup_{S \in V_\delta} G(S) < +\infty$.

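For $x$ and $y$ in $\mathbb{R}$ we denote by $x \wedge y$ and $x \vee y$ respectively the
minimum and the maximum of $x$ and $y$. Let us introduce the function

$$R_S(x, y) = 4\int_{-\infty}^{+\infty} \frac{F_S(v \wedge x)\,(1 - F_S(v \vee x))\,F_S(v \wedge y)\,(1 - F_S(v \vee y))}{\sigma(v)^2 f_S(v)}\,dv$$

and the quantity

$$\rho_*(S) = \int R_S(x, x)\,\nu(dx).$$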

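Let us introduce the following condition.

$\mathcal{Q}_1$. The function $S_* \in \mathcal{S}_*$ and for some $\delta > 0$

$$\sup_{S \in V_\delta}\rho_*(S) = \sup_{S \in V_\delta}\int 4\,\mathbf{E}_S\left(\frac{F_S(x)F_S(\xi) - F_S(\xi \wedge x)}{\sigma(\xi) f_S(\xi)}\right)^2 \nu(dx) < +\infty.$$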



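We have the following result (Kutoyants and Negri, 2001).

Theorem 1. Let $S_* \in \mathcal{S}_*$ and condition $\mathcal{Q}_1$ be fulfilled. Then

$$\lim_{\delta \to 0}\,\liminf_{T \to \infty}\,\inf_{\bar F_T}\,\sup_{S \in V_\delta}\rho_T(\bar F_T, F_S) \ge \rho_*(S_*),$$

where the inf is taken over all estimators $\bar F_T$ of $F_S$.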

The definition of an asymptotically efficient estimator arises naturally from the above
theorem.

Definition 1. Let condition $\mathcal{Q}_1$ be fulfilled. Then an estimator $\bar F_T$ is
called asymptotically efficient if for any $S_* \in \mathcal{S}_*$ we have

$$\lim_{\delta \to 0}\,\lim_{T \to \infty}\,\sup_{S \in V_\delta}\rho_T(\bar F_T, F_S) = \rho_*(S_*). \tag{6}$$
Put

$$H_{x,S}(y) = 2\int_0^y \frac{F_S(v \wedge x) - F_S(v)F_S(x)}{\sigma(v)^2 f_S(v)}\,dv. \tag{7}$$

Let us introduce the following condition.

$\mathcal{Q}_2$. Suppose that

$$\sup_{S \in V_\delta}\int \mathbf{E}_S H_{x,S}(\xi)^2\,\nu(dx) < +\infty.$$

The following theorem establishes the asymptotic efficiency of the empirical distribution
function (5) in the sense of equality (6).

Theorem 2. Let conditions $\mathcal{Q}_1$, $\mathcal{Q}_2$ hold and let $\rho_*(S)$ be
continuous in the uniform topology at the point $S_*$. Then the empirical distribution
function is asymptotically efficient.

The proof is given in Kutoyants and Negri (2001).

4 A class of unbiased estimators 

In this section we consider a class of estimators of $F_S(x)$ recently introduced in
Kutoyants (2004) and defined, for $x \in \mathbb{R}$, as

$$\tilde F_T(x) = \frac{1}{T}\int_0^T R_x(X_t)\,dX_t + \frac{1}{T}\int_0^T N_x(X_t)\,dt, \tag{8}$$

where $R_x(y) = 2\chi_{\{y<x\}}K_x(y)h(y)$, $N_x(y) = \chi_{\{y<x\}}K_x(y)h'(y)\sigma(y)^2$,
$K_x(y) = \int_y^x \frac{dv}{\sigma(v)^2 h(v)}$ and $h$ is a positive and continuously
differentiable function. It can be proved that these estimators, for different functions
$R_x$ and $N_x$, are all unbiased, consistent and asymptotically normal for a fixed $x$.
Let us suppose that the following conditions hold:

$$\mathbf{E}_S\big(R_x(\xi)\sigma(\xi)\big)^2 < +\infty, \qquad \mathbf{E}_S|N_x(\xi)| < +\infty, \qquad \lim_{y \to -\infty} R_x(y)\sigma(y)^2 f_S(y) = 0. \tag{9}$$

We have the following result (Kutoyants, 2004).

Theorem 3. Let $S \in \mathcal{S}_\sigma$, $R_S(x, x) < +\infty$, and conditions (9) be
fulfilled. Then the estimator $\tilde F_T(x)$ is unbiased, consistent and asymptotically
normal with variance given by $R_S(x, x)$.
Let us define the following function:

$$G_x(y) = 2\int_0^y \chi_{\{v<x\}}K_x(v)h(v)\,dv. \tag{10}$$

We introduce the following condition.

$\mathcal{Q}_3$. Suppose that

$$\sup_{S \in V_\delta}\int \mathbf{E}_S G_x(\xi)^2\,\nu(dx) < +\infty.$$

The following theorem proves that the new class of estimators defined by (8) is
asymptotically efficient.

Theorem 4. Let conditions $\mathcal{Q}_1$, $\mathcal{Q}_2$ and $\mathcal{Q}_3$ hold true and
let $\rho_*(S)$ be continuous in the uniform topology at the point $S_*$. Then

$$\lim_{\delta \to 0}\,\lim_{T \to \infty}\,\sup_{S \in V_\delta}\rho_T(\tilde F_T, F_S) = \rho_*(S_*). \tag{11}$$



Proof. Let us define $c_x(y) = \chi_{\{y<x\}}K_x(y)\,[2h(y)S(y) + h'(y)\sigma^2(y)] - F_S(x)$ and
$d_x(y) = 2\chi_{\{y<x\}}h(y)K_x(y)\sigma(y)$. We have the following representation of the
empirical process:

$$\sqrt{T}\,\big(\tilde F_T(x) - F_S(x)\big) = \frac{1}{\sqrt{T}}\int_0^T c_x(X_t)\,dt + \frac{1}{\sqrt{T}}\int_0^T d_x(X_t)\,dW_t. \tag{12}$$

Now we search for a function $M_{x,S}$ such that

$$M'_{x,S}(y)S(y) + \frac{1}{2}M''_{x,S}(y)\sigma^2(y) = c_x(y). \tag{13}$$

Putting $M'_{x,S} = m$, equation (13) can be rewritten as
$m' = \frac{2c_x}{\sigma^2} - \frac{2S}{\sigma^2}\,m$, which has solution

$$m(z) = \frac{2}{f(z)\sigma^2(z)}\int_{-\infty}^{z} c_x(v)f(v)\,dv. \tag{14}$$

Integrating by parts the integral in (14) and observing that

$$\big(\sigma^2(y)f(y)\big)' = 2S(y)f(y),$$

the function $m$ can be rewritten as

$$m(z) = 2\chi_{\{z<x\}}h(z)K_x(z) + 2\,\frac{F(x \wedge z) - F(x)F(z)}{\sigma^2(z)f(z)}. \tag{15}$$




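Choosing the function $M_{x,S}$ such that $M_{x,S}(0) = 0$, it has the form

$$M_{x,S}(y) = \int_0^y 2\chi_{\{z<x\}}h(z)K_x(z)\,dz + \int_0^y 2\,\frac{F(z \wedge x) - F(z)F(x)}{\sigma^2(z)f(z)}\,dz. \tag{16}$$

From (7), (10) and (16) we can write

$$M_{x,S}(y) = G_x(y) + H_{x,S}(y). \tag{17}$$

By the Itô formula we can write

$$dM_{x,S}(X_t) = \Big(M'_{x,S}(X_t)S(X_t) + \frac{1}{2}M''_{x,S}(X_t)\sigma^2(X_t)\Big)dt + M'_{x,S}(X_t)\sigma(X_t)\,dW_t.$$

Now we can replace the Lebesgue integral in (12) by means of this formula, and the
empirical process (12) becomes

$$\sqrt{T}\,\big(\tilde F_T(x) - F_S(x)\big) = \frac{M_{x,S}(X_T) - M_{x,S}(X_0)}{\sqrt{T}} - \frac{2}{\sqrt{T}}\int_0^T \frac{F_S(x \wedge X_t) - F_S(x)F_S(X_t)}{\sigma(X_t)f_S(X_t)}\,dW_t.$$

From (17) and conditions $\mathcal{Q}_2$ and $\mathcal{Q}_3$ it follows that

$$\lim_{T \to \infty}\,\sup_{S \in V_\delta}\int \mathbf{E}_S\left(\frac{M_{x,S}(X_T) - M_{x,S}(X_0)}{\sqrt{T}}\right)^2 \nu(dx) = 0. \tag{18}$$
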

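Moreover, by the continuity of $\rho_*(S)$ at the point $S_*$, as in Kutoyants and Negri
(2001) we can conclude

$$\lim_{\delta \to 0}\,\lim_{T \to \infty}\,\sup_{S \in V_\delta}\int \mathbf{E}_S\left(\frac{2}{\sqrt{T}}\int_0^T \frac{F_S(x \wedge X_t) - F_S(x)F_S(X_t)}{\sigma(X_t)f_S(X_t)}\,dW_t\right)^2 \nu(dx) = \rho_*(S_*). \tag{19}$$

So by (18) and (19), (11) follows and the proof is concluded. □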
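The integration by parts leading from (14) to (15) can be checked numerically in a special case. The sketch below assumes, purely for illustration, the drift $S(v) = -v$ with $\sigma = 1$ (invariant law $N(0, 1/2)$) and the weight $h(u) = e^{u}$, so that $\chi_{\{v<x\}}K_x(v)h(v) = \chi_{\{v<x\}}(1 - e^{v-x})$. The function names and the truncation of the lower limit at $-8$ are choices made here, not part of the paper.

```python
import math
import numpy as np

erf = np.vectorize(math.erf)

def f(v):
    """Invariant density of dX = -X dt + dW, i.e. the N(0, 1/2) density."""
    return np.exp(-v ** 2) / math.sqrt(math.pi)

def F(v):
    """Invariant distribution function of the same model."""
    return 0.5 * (1.0 + erf(v))

def m_from_14(z, x, lower=-8.0, n=200001):
    """m(z) = 2 / f(z) * int_{-inf}^z c_x(v) f(v) dv, with the integral truncated at `lower`."""
    v = np.linspace(lower, z, n)
    kh = np.where(v < x, 1.0 - np.exp(v - x), 0.0)   # chi{v<x} K_x(v) h(v)
    c = kh * (2.0 * (-v) + 1.0) - F(x)               # S(v) = -v, h'(v) = h(v), sigma = 1
    integrand = c * f(v)
    integral = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(v))
    return 2.0 * integral / f(z)

def m_from_15(z, x):
    """Closed form: 2 chi{z<x} h(z) K_x(z) + 2 (F(x ^ z) - F(x) F(z)) / f(z)."""
    kh = (1.0 - math.exp(z - x)) if z < x else 0.0
    return 2.0 * kh + 2.0 * (F(min(x, z)) - F(x) * F(z)) / f(z)

for z in (-0.5, 1.0):
    print(z, m_from_14(z, 0.3), m_from_15(z, 0.3))
```

The two columns should agree up to quadrature error, both for $z < x$ and for $z \ge x$.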



5 Examples 

In Theorem 2 and Theorem 4, conditions $\mathcal{Q}_1$ and $\mathcal{Q}_2$ involve the function
$S$, that is, the model. Examples of functions $S$ for which conditions $\mathcal{Q}_1$ and
$\mathcal{Q}_2$ are satisfied can be found in Kutoyants and Negri (2001) and in Kutoyants
(2004). Conditions (9) in Theorem 3 and $\mathcal{Q}_3$ in Theorem 4 concern the class of
estimators. The class of unbiased estimators defined by (8) is very general and is not
empty. In this section we show that, for a very large choice of functions $h(\cdot)$, the
related estimator $\tilde F_T$ satisfies condition $\mathcal{Q}_3$ and conditions (9).
Throughout this section let us take, for simplicity, $\sigma(u) = 1$.
Example 1. Let us consider $h(u) = 1 + u^{2p}$, $p \ge 1$. We have

$$K_x(y) = \int_y^x \frac{du}{1 + u^{2p}} = K_p(x) - K_p(y), \tag{20}$$

where $K_p(\cdot)$ is a primitive of $1/h(\cdot)$. For every $p \ge 1$ we have
$|K_x(y)| < \pi$. From this bound we have
$|R_x(y)| = |2\chi_{\{y<x\}}K_x(y)h(y)| < 2\pi|h(y)|$ and
$\mathbf{E}_S(R_x(\xi))^2 < 4\pi^2\int (1 + y^{2p})^2 f_S(y)\,dy < +\infty$, because the invariant
distribution admits moments of every order. Moreover
$N_x(y) = \chi_{\{y<x\}}K_x(y)\,2p\,y^{2p-1}$ and we have $\mathbf{E}_S(|N_x(\xi)|) < +\infty$.
Finally $\lim_{y \to -\infty} R_x(y)f_S(y) = 0$ and conditions (9) are satisfied. Let us now
consider $G_x(y) = \int_0^y R_x(v)\,dv$. We have

$$\int_{-\infty}^{+\infty}\left(\int_0^y R_x(v)\,dv\right)^2 f_S(y)\,dy < C, \tag{21}$$

where $C$ is a constant that does not depend on $S \in V_\delta$. Equation (21) implies that
condition $\mathcal{Q}_3$ is satisfied without any further condition on the measure $\nu$.
Note that for $p = 1$, (20) becomes
$K_x(v) = \int_v^x \frac{du}{1 + u^2} = \arctan x - \arctan v < \pi$ for every $v$ and $x$.
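The bound $|K_x(y)| < \pi$ can be illustrated numerically: since the integrand is positive, $|K_x(y)|$ never exceeds the integral of $1/(1 + u^{2p})$ over the whole line. The sketch below evaluates that integral after the substitution $u = \tan\theta$ and compares it with the standard closed form $(\pi/p)/\sin(\pi/(2p))$, which equals $\pi$ for $p = 1$ and is smaller for $p > 1$. The function names and the number of quadrature nodes are assumptions of this example.

```python
import math
import numpy as np

def total_mass(p, n=20000):
    """Midpoint-rule value of int_{-inf}^{+inf} du / (1 + u^(2p)) using u = tan(theta)."""
    theta = -math.pi / 2 + (np.arange(n) + 0.5) * math.pi / n
    c, s = np.cos(theta), np.sin(theta)
    # du / (1 + u^(2p)) becomes cos^(2p-2) / (cos^(2p) + sin^(2p)) d(theta).
    integrand = c ** (2 * p - 2) / (c ** (2 * p) + s ** (2 * p))
    return float(np.sum(integrand) * math.pi / n)

def closed_form(p):
    """Known value of the same integral: (pi / p) / sin(pi / (2p))."""
    return (math.pi / p) / math.sin(math.pi / (2 * p))

for p in (1, 2, 3, 5):
    print(p, total_mass(p), closed_form(p))
```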

Example 2. Let us consider $h(u) = e^{\delta u}$, $\delta > 0$. We have

$$K_x(y) = \int_y^x \frac{du}{e^{\delta u}} = \frac{e^{-\delta y} - e^{-\delta x}}{\delta}$$

and

$$R_x(y) = \frac{2}{\delta}\,\chi_{\{y<x\}}\left(1 - e^{\delta(y - x)}\right). \tag{22}$$

From (22) it follows that $R_x(y)$ is bounded with respect to both the variables $x$ and
$y$. By virtue of this fact, conditions (9) and $\mathcal{Q}_3$ can easily be checked. In
this case, too, no further condition on the measure $\nu$ is needed.
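To see the estimator of Example 2 at work, one can discretise (8) on a simulated path. The sketch below is an illustration only: it assumes the drift $S(x) = -x$ with $\sigma = 1$ (so that $F_S(0) = 1/2$), uses an Euler-Maruyama path, and replaces the stochastic and Lebesgue integrals by left-point sums. The function names, the horizon `T` and the step `dt` are choices made here.

```python
import math
import numpy as np

def simulate_ou(T, dt, rng):
    """Euler-Maruyama path of dX_t = -X_t dt + dW_t started from the invariant law N(0, 1/2)."""
    n = int(round(T / dt))
    x = np.empty(n + 1)
    x[0] = rng.normal(0.0, math.sqrt(0.5))
    dW = rng.normal(0.0, math.sqrt(dt), size=n)
    for i in range(n):
        x[i + 1] = x[i] - x[i] * dt + dW[i]
    return x

def unbiased_estimator(path, x, dt, delta=1.0):
    """Discretised version of (8) with sigma = 1 and h(u) = exp(delta * u)."""
    y = path[:-1]                                            # left endpoints (Ito sums)
    g = np.where(y < x, 1.0 - np.exp(delta * (y - x)), 0.0)
    R = 2.0 * g / delta                                      # R_x(y) = 2 chi K_x(y) h(y)
    N = g                                                    # N_x(y) = chi K_x(y) h'(y)
    T = dt * len(y)
    return float((np.sum(R * np.diff(path)) + np.sum(N) * dt) / T)

rng = np.random.default_rng(1)
ou_path = simulate_ou(T=1000.0, dt=0.01, rng=rng)
estimate = unbiased_estimator(ou_path, x=0.0, dt=0.01)
print(estimate)  # should be close to F_S(0) = 0.5, up to sampling and discretisation error
```

The same path can be passed to an empirical distribution function for comparison; both estimators target $F_S(x)$, and the theory above says that their normalised risks share the same limit.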

Example 3. Let us consider $h(u) = c$, where $c$ is a real constant. We have
$K_x(v) = (x - v)/c$, $R_x(v) = 2\chi_{\{v<x\}}(x - v)$ and $N_x(v) = 0$. In this case the
first of the conditions (9) is not satisfied uniformly in $x$. In fact it follows that
$\mathbf{E}_S(R_x(\xi))^2 = 4x^2F_S(x) + C$, where $C$ is a constant depending on the moments
of the invariant distribution. For each fixed $x$, $\mathbf{E}_S(R_x(\xi))^2$ is finite, but
when $x \to +\infty$, $\mathbf{E}_S(R_x(\xi))^2$ also goes to infinity. In any case we have
$G_x(y) = 2x(y \wedge x) - (y \wedge x)^2$, and $\int \mathbf{E}_S(G_x(\xi))^2\,\nu(dx)$ is
finite if the measure $\nu$ admits a moment of fourth order.
We observe that the empirical distribution function does not belong to this class of
estimators. Indeed, it would be necessary that $R_x(y) = 0$ and $K_x(y)h'(y) = 1$ for every
$y$ and $x$ belonging to $\mathbb{R}$. But if such a function $h$ exists, it has to depend
on $x$, and this is not possible.